Speech recognition in noisy environments using a switching linear dynamic model for feature enhancement

نویسندگان

  • Björn W. Schuller
  • Martin Wöllmer
  • Tobias Moosmayr
  • Gerhard Rigoll
چکیده

The performance of automatic speech recognition systems strongly decreases whenever the speech signal is disturbed by background noise. We aim to improve noise robustness focusing on all major levels of speech recognition: feature extraction, feature enhancement, and speech modeling. Different auditory modeling concepts, speech enhancement techniques, training strategies, and model architectures are implemented in an in-car digit and spelling recognition task. We prove that joint speech and noise modeling with a global Switching Linear Dynamic Model (SLDM) capturing the dynamics of speech, and a Linear Dynamic Model (LDM) for noise, prevails over state-of-theart speech enhancement techniques. Furthermore we show that the baseline recognizer of the Interspeech Consonant Challenge 2008 can be outperformed by SLDM feature enhancement for almost all of the noisy testsets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Noisy Speech Feature Estimation on the Aurora2 Database using a Switching Linear Dynamic Model

This paper presents an approach to enhance speech feature estimation in the log spectral domain under additive noise environments. A switching linear dynamic model (SLDM) is explored as a parametric model for the clean speech distribution, enforcing a state transition in the feature space and capturing the smooth time evolution of speech conditioned on the state sequence. Experimental results u...

متن کامل

Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement

Performance of speech recognition systems strongly degrades in the presence of background noise, like the driving noise inside a car. In contrast to existing works, we aim to improve noise robustness focusing on all major levels of speech recognition: feature extraction, feature enhancement, speech modelling, and training. Thereby, we give an overview of promising auditory modelling concepts, s...

متن کامل

Dynamic Robust Speech Recognition

Robust recognition theory has become one of research focuses of acoustic speech recognition. Acoustic speech digital signal is a random process repeatedly alternating stationary pieces with non-stationary pieces. However both the current linear and stationary characteristic parameters drawn from such signals and the rigid recognition models do not adapt to such repeatedly alternating property o...

متن کامل

An approach to iterative speech feature enhancement and recognition

In this paper we propose a novel iterative speech feature enhancement and recognition architecture for noisy speech recognition. It consists of model-based feature enhancement employing Switching Linear Dynamical Models (SLDM), a hidden Markov Model (HMM) decoder and a state mapper, which maps HMM to SLDM states. To consistently adhere to a Bayesian paradigm, posteriors are exchanged between th...

متن کامل

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008